Stochastic Density Ratio Estimation and Its Application to Feature Selection

Author

Igor Braga

Abstract

In this work, we deal with a relatively new statistical tool in machine learning: the estimation of the ratio of two probability densities, or density ratio estimation for short. As a side piece of research that gained its own traction, we also tackle the task of parameter selection in learning algorithms based on kernel methods.

1 Density Ratio Estimation

The estimation of the ratio of two probability densities $r(x) = p_1(x) / p_2(x)$ is a statistical inference problem that finds useful applications in machine learning. Several approaches have been proposed and studied for the direct solution of the density ratio estimation problem, that is, to estimate the density ratio without going through density estimation [Sugiyama et al., 2011, and references therein]. By avoiding taking the ratio of two estimated densities, we avoid a dangerous source of error propagation. Next, we introduce situations where density ratio estimation naturally arises.

Covariate-shift adaptation. Under the hood, most supervised learning algorithms apply the so-called Empirical Risk Minimization (ERM) principle, which selects a function $f_n^*$ from a given set of functions $\mathcal{F}$ that minimizes the average of a loss function $L : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ over a given set of training points $\{(x_1, y_1), \ldots, (x_n, y_n)\}$. Formally:
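The ERM principle described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the squared loss, the finite candidate set of lines through the origin, the toy training points, and the helper names `squared_loss`, `empirical_risk`, and `erm`.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
Predictor = Callable[[float], float]


def squared_loss(prediction: float, target: float) -> float:
    """Illustrative loss L(f(x), y); the source does not fix a particular loss."""
    return (prediction - target) ** 2


def empirical_risk(f: Predictor, data: Sequence[Point],
                   loss: Callable[[float, float], float] = squared_loss) -> float:
    """Average loss of f over the training points."""
    return sum(loss(f(x), y) for x, y in data) / len(data)


def erm(candidates: Sequence[Predictor], data: Sequence[Point],
        loss: Callable[[float, float], float] = squared_loss) -> Predictor:
    """Return the candidate function with the smallest empirical risk."""
    return min(candidates, key=lambda f: empirical_risk(f, data, loss))


# Hypothetical finite function class F: lines through the origin, f(x) = a * x.
slopes = [0.5, 1.0, 2.0, 3.0]
candidates = [(lambda x, a=a: a * x) for a in slopes]

# Toy training set generated by y = 2x (assumed data, for illustration only).
training_points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

best = erm(candidates, training_points)
best_risk = empirical_risk(best, training_points)
```

In this toy setting the selected function is the candidate with slope 2, since it fits the three points exactly. Real function classes are usually infinite, so the minimization is carried out by an optimizer instead of by enumeration.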


Related resources

Improvement of effort estimation accuracy in software projects using a feature selection approach

In recent years, utilization of feature selection techniques has become an essential requirement for processing and model construction in different scientific areas. In the field of software project effort estimation, the need to apply dimensionality reduction and feature selection methods has become an inevitable demand. The high volumes of data, costs, and time necessary for gathering data, ...

Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gen Expression

Nowadays, increasing the volume of data and the number of attributes in the dataset has reduced the accuracy of the learning algorithm and the computational complexity. A dimensionality reduction method is a feature selection method, which is done through filtering and wrapping. The wrapper methods are more accurate than filter ones but perform faster and have a less computational burden. With ...

Bridging the semantic gap for software effort estimation by hierarchical feature selection techniques

Software project management is one of the significant activities in the software development process. Software Development Effort Estimation (SDEE) is a challenging task in software project management. SDEE is an old activity in the computer industry, dating from the 1940s, and has been reviewed several times. An SDEE model is appropriate if it provides the accuracy and confidence simultaneously before softwa...

Direct Density-Ratio Estimation with Dimensionality Reduction via Hetero-Distributional Subspace Analysis

Methods for estimating the ratio of two probability density functions have been actively explored recently since they can be used for various data processing tasks such as non-stationarity adaptation, outlier detection, feature selection, and conditional probability estimation. In this paper, we propose a new density-ratio estimator which incorporates dimensionality reduction into the densityra...

Direct Density Ratio Estimation with Convolutional Neural Networks with Application in Outlier Detection

Recently, the ratio of probability density functions was demonstrated to be useful in solving various machine learning tasks such as outlier detection, non-stationarity adaptation, feature selection, and clustering. The key idea of this density ratio approach is that the ratio is directly estimated so that difficult density estimation is avoided. So far, parametric and non-parametric direct den...

Publication date: 2015
